Cache Misses Prediction for High Performance Sparse Algorithms
Abstract
Many scientific applications handle compressed sparse matrices. The cache behavior of codes with irregular access patterns, such as those generated by this type of matrix, has not been widely studied. This work presents a probabilistic model for predicting the number of misses on a direct-mapped cache memory, considering sparse matrices with a uniform distribution. As an example of the potential usability of such models, and taking into account the state of the art in high-performance superscalar and/or superpipelined CPUs with a multilevel memory hierarchy, we have modeled the cache behavior of an optimized sparse matrix-dense matrix product algorithm that includes blocking at the memory and register levels.
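To make the quantity being predicted concrete, the following is a minimal sketch of how misses on a direct-mapped cache could be counted empirically for the dense operand of a sparse matrix-dense matrix product. It is a simulation, not the probabilistic model of the paper, and it ignores blocking. All names (`direct_mapped_misses`, `random_csr_columns`, `dense_operand_trace`) and all parameters (matrix size, density, cache geometry) are assumptions chosen for illustration, and only accesses to the dense matrix B are traced.

```python
import random

def direct_mapped_misses(addresses, num_lines, line_size):
    """Count misses for a trace of word addresses on a direct-mapped cache."""
    tags = [None] * num_lines          # memory line currently held by each cache line
    misses = 0
    for addr in addresses:
        mem_line = addr // line_size   # memory line containing this word
        slot = mem_line % num_lines    # the single cache line it can occupy
        if tags[slot] != mem_line:     # cold miss or conflict miss
            misses += 1
            tags[slot] = mem_line
    return misses

def random_csr_columns(n_rows, n_cols, density, seed=0):
    """Per-row column indices of a sparse matrix whose entries are nonzero
    independently with probability `density` (assumed uniform distribution)."""
    rng = random.Random(seed)
    return [[j for j in range(n_cols) if rng.random() < density]
            for _ in range(n_rows)]

def dense_operand_trace(rows, n_dense_cols):
    """Word addresses read from a row-major dense matrix B while computing
    C = A * B with sparse A: each nonzero A[i][j] reads row j of B."""
    for cols in rows:
        for j in cols:
            base = j * n_dense_cols
            for k in range(n_dense_cols):
                yield base + k

# Hypothetical sizes: 200 x 200 sparse A, 8 dense columns, 64-line cache, 4 words per line.
rows = random_csr_columns(n_rows=200, n_cols=200, density=0.05)
trace = list(dense_operand_trace(rows, n_dense_cols=8))
misses = direct_mapped_misses(trace, num_lines=64, line_size=4)
print(f"{misses} misses out of {len(trace)} accesses to B")
```

A probabilistic model of the kind described in the abstract aims to estimate this count from matrix and cache parameters without replaying the access trace; a simulator like this one would only serve as a reference to compare such estimates against.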
Similar Resources
A Cache-Optimal Alternative to the Unidirectional Hierarchization Algorithm
The sparse grid combination technique provides a framework to solve high-dimensional numerical problems with standard solvers by assembling a sparse grid from many coarse and anisotropic full grids called component grids. Hierarchization is one of the most fundamental tasks for sparse grids. It describes the transformation from the nodal basis to the hierarchical basis. In settings where the co...
Direct mapped cache performance modeling for sparse matrix operations
Sparse matrices are in the kernel of numerical applications. Their compressed storage, which permits both operations and memory savings, generates irregular access patterns, reducing the performance of the memory hierarchy. In this work we present a probabilistic model for the prediction of the number of misses of a direct mapped cache memory, considering sparse matrices with a uniform entries ...
Locality of Reference in Sparse Cholesky Factorization Methods
Abstract. This paper analyzes the cache efficiency of two high-performance sparse Cholesky factorization algorithms: the multifrontal algorithm and the left-looking algorithm. These two are essentially the only two algorithms that are used in current codes; generalizations of these algorithms are used in general-symmetric and general-unsymmetric sparse triangular factorization codes. Our theore...
Modeling Set Associative Caches Behaviour for Irregular Computations
While much work has been devoted to the study of cache behavior during the execution of codes with regular access patterns, little attention has been paid to irregular codes. An important portion of these codes are scientific applications that handle compressed sparse matrices. In this work a probabilistic model for the prediction of the number of misses on a K-way associative cache memory consi...
Modeling of L2 Cache Behavior for Thread-Parallel Scientific Programs on Chip Multi-Processors
It is critical to provide high performance for scientific programs running on a Chip MultiProcessor (CMP). A CMP architecture often has a shared L2 cache and lower storage hierarchy. The shared L2 cache can reduce the number of cache misses if the data are commonly shared by several threads, but it can also lead to performance degradation due to resource contention. Sometimes running threads on...